Unsupervised Morphology Rivals Supervised Morphology for Arabic MT

Authors

  • David Stallard
  • Jacob Devlin
  • Michael Kayser
  • Yoong Keok Lee
  • Regina Barzilay
Abstract

If unsupervised morphological analyzers could approach the effectiveness of supervised ones, they would be a very attractive choice for improving MT performance on low-resource inflected languages. In this paper, we compare performance gains for state-of-the-art supervised vs. unsupervised morphological analyzers, using a state-of-the-art Arabic-to-English MT system. We apply maximum marginal decoding to the unsupervised analyzer, and show that this yields the best published segmentation accuracy for Arabic, while also making segmentation output more stable. Our approach gives an 18% relative BLEU gain for Levantine dialectal Arabic. Furthermore, it gives higher gains for Modern Standard Arabic (MSA), as measured on NIST MT-08, than does MADA (Habash and Rambow, 2005), a leading supervised MSA segmenter.
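
The abstract does not spell out how maximum marginal decoding is applied. As a rough illustration only, one common reading is that several independent sampler runs each propose a segmentation for every word, and the segmentation chosen most often is kept. The sketch below assumes that reading; the function name `max_marginal_decode`, the input layout, and the toy transliterated words are all hypothetical and are not taken from the paper.

```python
from collections import Counter

def max_marginal_decode(runs):
    """Pick, for each word, the segmentation proposed most often across runs.

    `runs` is a list of dicts (one per sampler run, an assumed layout),
    each mapping a word to a tuple of morphemes.
    """
    votes = {}
    for run in runs:
        for word, segmentation in run.items():
            # Count how often each candidate segmentation is proposed.
            votes.setdefault(word, Counter())[segmentation] += 1
    # Keep the most frequent segmentation per word.
    return {word: counts.most_common(1)[0][0] for word, counts in votes.items()}

# Toy, made-up sampler outputs for two transliterated words.
example_runs = [
    {"wktb": ("w", "ktb"), "ktbhm": ("ktb", "hm")},
    {"wktb": ("w", "ktb"), "ktbhm": ("ktbh", "m")},
    {"wktb": ("wk", "tb"), "ktbhm": ("ktb", "hm")},
]
decoded = max_marginal_decode(example_runs)
```

Voting across runs is also one plausible reason the output becomes more stable: a segmentation that appears in only a single run is outvoted rather than passed on to the MT system.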

Similar resources

The MIT-LL/AFRL IWSLT-2009 MT system

This paper describes the MIT-LL/AFRL statistical MT system and the improvements that were developed during the IWSLT 2009 evaluation campaign. As part of these efforts, we experimented with a number of extensions to the standard phrase-based model that improve performance on the Arabic and Turkish to English translation tasks. We discuss the architecture of the MIT-LL/AFRL MT system, improvemen...

Semi-Supervised Learning of Concatenative Morphology

We consider morphology learning in a semi-supervised setting, where a small set of linguistic gold standard analyses is available. We extend Morfessor Baseline, which is a method for unsupervised morphological segmentation, to this task. We show that known linguistic segmentations can be exploited by adding them into the data likelihood function and optimizing separate weights for unlabeled and...

Natural Language Processing Of Morphology With Linguistically Motivated Applications To German Linking Elements

A survey of the history of the learning of morphological rules is presented. Further investigation is made into the current state of NLP techniques with regard to supervised and unsupervised learning of morphology. An analysis of the outstanding problem of “German linking elements” is presented and reviewed. Finally, a proposal is made with the goal of applying current morphological analysis and ...

An unsupervised generalized Hough transform for natural shapes

The Hough transform was originally designed to recognize artificial objects in images. A Hough transform for natural shapes (HTNS) was subsequently proposed, but necessitates the supervised learning of the class of shapes. Here, we extend HTNS to unsupervised pattern recognition, the variability of the object class being coded with tools originating from mathematical morphology (erosion, dilation...

Publication date: 2012